Towards a better labeling process for network security datasets
Most network security datasets do not have comprehensive label assignment
criteria, hindering the evaluation of the datasets, the training of models, the
results obtained, the comparison with other methods, and the evaluation in
real-life scenarios. There is neither a labeling ontology nor tools to help assign the
labels, resulting in most analyzed datasets assigning labels in file or
directory names. This paper addresses the problem of having a better labeling
process by (i) reviewing the needs of stakeholders of the datasets, from
creators to model users, (ii) presenting a new ontology of label assignment,
(iii) presenting a new tool for assigning structured labels for Zeek network
flows based on the ontology, and (iv) studying the differences between
generating labels and consuming labels in real-life scenarios. We conclude that
a process for structured label assignment is paramount for advancing research
in network security and that the new ontology-based label assignment rules
should be published as an artifact of every dataset.
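
To illustrate the difference between a label encoded in a file or directory name and a structured label attached to a flow, here is a minimal sketch. The label fields (`category`, `family`, `assigned_by`), the `FlowLabel` class, and the `label_flow` helper are hypothetical and are not the ontology or the tool presented in the paper. Only `uid`, `id.orig_h`, `id.resp_h`, and `id.resp_p` are standard Zeek conn.log field names, and the sample values are made up.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FlowLabel:
    # Hypothetical label fields for illustration, not the paper's ontology.
    category: str     # e.g. "malicious" or "benign"
    family: str       # e.g. "bruteforce"; empty string when unknown
    assigned_by: str  # the person or rule that produced the label


def label_flow(flow: dict, label: FlowLabel) -> dict:
    """Return a copy of a Zeek conn.log record with a structured label attached."""
    labeled = dict(flow)              # copy so the original record is not modified
    labeled["label"] = asdict(label)  # nested dict instead of a free-text string
    return labeled


# Made-up flow record using standard Zeek conn.log field names.
flow = {"uid": "C1", "id.orig_h": "10.0.0.5", "id.resp_h": "192.0.2.10", "id.resp_p": 80}
labeled = label_flow(flow, FlowLabel("malicious", "bruteforce", "manual-rule-1"))
print(json.dumps(labeled, sort_keys=True))
```

Keeping the label as a nested record lets a consumer filter on one field without parsing file or directory names.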
Attacker Profiling Through Analysis of Attack Patterns in Geographically Distributed Honeypots
Honeypots are a well-known and widely used technology in the cybersecurity
community, where it is assumed that placing honeypots in different geographical
locations provides better visibility and increases effectiveness. However, how
geolocation affects the usefulness of honeypots is not well-studied, especially
for threat intelligence as early warning systems. This paper examines attack
patterns in a large public dataset of geographically distributed honeypots by
answering methodological questions and creating behavioural profiles of
attackers. Results show that the location of honeypots helps identify attack
patterns and build profiles for the attackers. We conclude that not all the
intelligence collected from geographically distributed honeypots is equally
valuable and that a good early warning system against resourceful attackers may
be built with only two distributed honeypots and a production server.
LLM in the Shell: Generative Honeypots
Honeypots are essential tools in cybersecurity. However, most of them (even
the high-interaction ones) lack the required realism to engage and fool human
attackers. This limitation makes them easily discernible, hindering their
effectiveness. This work introduces a novel method to create dynamic and
realistic software honeypots based on Large Language Models. Preliminary
results indicate that LLMs can create credible and dynamic honeypots capable of
addressing important limitations of previous honeypots, such as deterministic
responses, lack of adaptability, etc. We evaluated the realism of each command
by conducting an experiment with human attackers who needed to say if the
answer from the honeypot was fake or not. Our proposed honeypot, called shelLM,
reached an accuracy rate of 0.92.
Comment: 5 pages, 1 figure, 1 table
Make It Count: an Analysis of a Brute-forcing Botnet
The smallest element in a botnet is a bot. The behavior of a bot can change dynamically based on the decisions of the botmaster. Because botnets are commonly driven by profit, each bot is expected to be profitable. If an infected bot does not fulfill these expectations, the botmaster can instruct it to switch its behavior to serve a better purpose. This paper presents a detailed analysis of a network traffic capture of a machine originally infected by a Gamarue variant. The analysis uncovers the behavior of the bot from the initial infection, through an inactivity period and the delivery of a new payload, to the subsequent switch in the bot's behavior. The paper analyzes the infection in detail, including the horizontal brute-forcing activity affecting thousands of WordPress websites. The goal of the paper is to show a concrete example of a bot performing brute-forcing, analyze it, and identify the mechanisms used and the indicators of compromise that will help detect it.
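
As a rough illustration of what "horizontal" brute-forcing looks like in traffic (one source trying logins against many different sites), here is a minimal sketch. It is not the analysis method used in the paper. It assumes HTTP request records shaped like Zeek http.log entries (`id.orig_h`, `host`, `method`, `uri`) and assumes the login attempts are POST requests to `wp-login.php`. The function name, the `min_hosts` threshold, and the sample records are hypothetical.

```python
from collections import defaultdict


def horizontal_bruteforce_sources(http_records, min_hosts=3):
    """Return {source IP: set of hosts} for sources that sent WordPress login
    POSTs to at least `min_hosts` distinct hosts (assumed threshold)."""
    targets = defaultdict(set)
    for rec in http_records:
        # Assumption: a login attempt is a POST whose URI contains wp-login.php.
        if rec.get("method") == "POST" and "wp-login.php" in rec.get("uri", ""):
            targets[rec["id.orig_h"]].add(rec["host"])
    return {src: hosts for src, hosts in targets.items() if len(hosts) >= min_hosts}


# Made-up records for illustration only.
records = [
    {"id.orig_h": "10.0.0.5", "host": "a.example", "method": "POST", "uri": "/wp-login.php"},
    {"id.orig_h": "10.0.0.5", "host": "b.example", "method": "POST", "uri": "/wp-login.php"},
    {"id.orig_h": "10.0.0.5", "host": "c.example", "method": "POST", "uri": "/blog/wp-login.php"},
    {"id.orig_h": "10.0.0.9", "host": "a.example", "method": "GET", "uri": "/index.html"},
]
suspects = horizontal_bruteforce_sources(records)
print(sorted(suspects))
```

Counting distinct target hosts per source, rather than total requests, is what separates horizontal activity from repeated attempts against a single site.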